{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Autograd: 自动求导\n",
    "===================================\n",
    "\n",
    "PyTorch的核心是``autograd``包。\n",
    "我们首先简单的了解一些，然后用PyTorch开始训练第一个神经网络。\n",
    "\n",
    "\n",
    "\n",
    "``autograd``为所有用于Tensor的operation提供自动求导的功能。\n",
    "\n",
    "我们通过一些简单的例子来学习它基本用法。\n",
    "\n",
    "Tensor\n",
    "--------\n",
    "\n",
    "``torch.Tensor`` 是这个包的核心类。如果它的属性\n",
    "``.requires_grad``是``True``，那么PyTorch就会追踪所有与之相关的operation。当完成(正向)计算之后，\n",
    "我们可以调用``.backward()``，PyTorch会自动的把所有的梯度都计算好。与这个tensor相关的梯度都会累加到它的\n",
    "``.grad``属性里。\n",
    "\n",
    "如果不想计算这个tensor的梯度，我们可以调用``.detach()``，这样它就不会参与梯度的计算了。\n",
    "\n",
    "\n",
    "为了阻止PyTorch记录用于梯度计算相关的信息(从而节约内存)，我们可以使用 ``with torch.no_grad():``。\n",
    "这在模型的预测时非常有用，因为预测的时候我们不需要计算梯度，否则我们就得一个个的修改Tensor的`requires_grad`属性，\n",
    "这会非常麻烦。\n",
    "\n",
    "关于autograd的实现还有一个很重要的类``Function``.\n",
    "\n",
    "``Tensor``和``Function``相互连接从而形成一个有向无环图。, 这个图记录了计算的完整历史。\n",
    "每个tensor有一个``.grad_fn``属性来引用创建这个tensor的``Function`` that has created\n",
    "the ``Tensor`` (除了用户直接创建的Tensor，这些Tensor的``grad_fn``是None)。\n",
    "\n",
    "如果你想计算梯度，可以对一个``Tensor``调用它的``.backward()``方法。如果这个Tensor是一个scalar(只有一个数)，\n",
    "那么调用时不需要传任何参数。如果Tensor多于一个数，那么需要传入和它的shape一样的参数，表示反向传播过来的梯度。\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "创建tensor时设置属性requires_grad=True，PyTorch就会记录用于反向梯度计算的信息。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[1., 1.],\n",
      "        [1., 1.]], requires_grad=True)\n"
     ]
    }
   ],
   "source": [
    "x = torch.ones(2, 2, requires_grad=True)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过operation产生新的tensor：\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[3., 3.],\n",
      "        [3., 3.]], grad_fn=<AddBackward>)\n"
     ]
    }
   ],
   "source": [
    "y = x + 2\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``y``是通过operation产生的tensor，因此它的``grad_fn``不是None。\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<AddBackward object at 0x7f4f2873d550>\n"
     ]
    }
   ],
   "source": [
    "print(y.grad_fn)"
   ]
  },
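  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By contrast, ``x`` was created directly by the user, so no ``Function`` created it and its ``grad_fn`` is ``None`` (a quick check added for illustration):\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# x is a leaf tensor created by the user, not by an operation.\n",
    "print(x.grad_fn)  # expected: None"
   ]
  },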
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "再通过y得到z和out\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[27., 27.],\n",
      "        [27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)\n"
     ]
    }
   ],
   "source": [
    "z = y * y * 3\n",
    "out = z.mean()\n",
    "\n",
    "print(z, out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``.requires_grad_( ... )``函数会修改一个Tensor的``requires_grad``。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "True\n",
      "<SumBackward0 object at 0x7f4f28751518>\n"
     ]
    }
   ],
   "source": [
    "a = torch.randn(2, 2)\n",
    "a = ((a * 3) / (a - 1))\n",
    "print(a.requires_grad)\n",
    "a.requires_grad_(True)\n",
    "print(a.requires_grad)\n",
    "b = (a * a).sum()\n",
    "print(b.grad_fn)"
   ]
  },
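  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same in-place call can turn tracking back off again; a small sketch, not in the original text (it works here because ``a`` is a leaf tensor):\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Flip the flag back off; subsequent operations on a are no longer tracked.\n",
    "a.requires_grad_(False)\n",
    "print(a.requires_grad)  # False"
   ]
  },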
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "梯度\n",
    "---------\n",
    "现在我们里反向计算梯度。\n",
    "因为``out``是一个scalar，因此``out.backward()``等价于``out.backward(torch.tensor(1))``。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "out.backward()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "打印梯度d(out)/dx\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[4.5000, 4.5000],\n",
      "        [4.5000, 4.5000]])\n"
     ]
    }
   ],
   "source": [
    "print(x.grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们手动计算来验证一下。为了简单，我们把out记为o。\n",
    "$o = \\frac{1}{4}\\sum_i z_i$,\n",
    "$z_i = 3(x_i+2)^2$ 并且 $z_i\\bigr\\rvert_{x_i=1} = 27$.\n",
    "因此，\n",
    "$\\frac{\\partial o}{\\partial x_i} = \\frac{3}{2}(x_i+2)$，因此\n",
    "$\\frac{\\partial o}{\\partial x_i}\\bigr\\rvert_{x_i=1} = \\frac{9}{2} = 4.5$.\n",
    "\n"
   ]
  },
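  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not part of the original text) of the accumulation behavior mentioned earlier: repeated ``backward()`` calls sum their results into ``.grad``, which is why training loops typically zero the gradients between iterations:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "w = torch.ones(2, requires_grad=True)\n",
    "(w * w).sum().backward()\n",
    "print(w.grad)   # tensor([2., 2.])\n",
    "(w * w).sum().backward()\n",
    "print(w.grad)   # tensor([4., 4.]) -- the second call accumulated into .grad\n",
    "w.grad.zero_()  # reset in place before the next backward pass\n",
    "print(w.grad)   # tensor([0., 0.])"
   ]
  },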
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们也可以用autograd做一些很奇怪的事情！比如y和x的关系是while循环的关系(似乎很难用一个函数直接表示y和x的关系？对x不断平方直到超过1000，这是什么函数？)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([ 267.9040, 1054.1506,  607.1303], grad_fn=<MulBackward>)\n"
     ]
    }
   ],
   "source": [
    "x = torch.randn(3, requires_grad=True)\n",
    "\n",
    "y = x * 2\n",
    "while y.data.norm() < 1000:\n",
    "    y = y * 2\n",
    "\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([ 51.2000, 512.0000,   0.0512])\n"
     ]
    }
   ],
   "source": [
    "gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)\n",
    "y.backward(gradients)\n",
    "\n",
    "print(x.grad)"
   ]
  },
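  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick sanity check of this result (added for clarity): the loop produces $y = 2^n x$ elementwise for some data-dependent $n$, so $\\frac{\\partial y_i}{\\partial x_i} = 2^n$ and ``x.grad`` equals ``gradients`` scaled by $2^n$. In the run shown above $2^n = 512$, matching $0.1 \\times 512 = 51.2$, $1.0 \\times 512 = 512$, and $0.0001 \\times 512 = 0.0512$.\n"
   ]
  },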
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们可以使用``with torch.no_grad():``来停止梯度的计算：\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n",
      "True\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "print(x.requires_grad)\n",
    "print((x ** 2).requires_grad)\n",
    "\n",
    "with torch.no_grad():\n",
    "\tprint((x ** 2).requires_grad)"
   ]
  },
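  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And a minimal sketch of the ``.detach()`` call mentioned at the start: it returns a new Tensor with the same content that does not require gradients.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(x.requires_grad)  # True\n",
    "y = x.detach()          # same values, detached from the computation graph\n",
    "print(y.requires_grad)  # False\n",
    "print(x.eq(y).all())    # contents are identical"
   ]
  },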
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**深入阅读：**\n",
    "\n",
    "``autograd``和``Function``的文档在 http://pytorch.org/docs/autograd\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py3.6-env",
   "language": "python",
   "name": "py3.6-env"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
